ABSTRACT



The reauthorization of the Individuals with Disabilities 
Education Act (IDEA) in 1997 clarified that special education was to fully 
participate in educational accountability systems related to standards-based 
reform. Special education students could participate in the general 
assessment, with or without accommodations, or in an alternate assessment. 
IDEA did not specify the content or form of alternate assessment, but 
required that a system for alternate assessment be in place by July 2000 and 
that results of alternate assessments be reported with the same frequency and 
in the same detail as general assessment results. Inclusion of special 
education students in statewide testing ensures that they remain visible in 
subsequent decision making about policy and resource allocation. This paper 
discusses issues in the development of alternate assessments: (1) their purpose 
as an evaluation of the educational system's performance with regard to 
students with severe disabilities; (2) the need for states to have 
broad-based, inclusive academic standards that can be linked to the 
functional curricula of special education; (3) eligibility criteria based on 
disability classification or curricular focus; (4) forms of alternate 
assessment (portfolio assessment, checklists or rating scales of functional 
skills, IEP analysis); (5) who scores alternate assessments; and (6) methods 
of reporting results. (Contains 15 references.) (SV) 

ALTERNATE ASSESSMENT: NO CHILD LEFT BEHIND DURING STATEWIDE TESTING 

Standards-based reform has swept across our nation’s educational system over the past decade. On both 
national and state levels, data have been collected, often through wide-scale testing, to help determine the current 
state of our educational system, identify problems, and develop plans for continued improvement (Vanderwood, 
McGrew, & Ysseldyke, 1998). The reauthorization of the Individuals with Disabilities Education Act (P.L. 105-17, 
1997 [IDEA 97]) clarified that special education was to fully participate in these educational accountability systems. 
IDEA 97 mandated that all students with disabilities participate in statewide testing and that those students who are 
unable to participate in the general assessment system, even with appropriate accommodations, must participate 
through an alternate assessment. The Individualized Education Program (IEP) team was given the responsibility of 
determining the most appropriate of the following three options: (1) the student would participate in the general 
assessment without accommodations; (2) the student would participate in the general assessment with 
accommodations; or (3) the student would complete an alternate assessment. IDEA 97 did not mandate the content 
or form of the alternate assessment, but required that a system for alternate assessment be in place by July 1, 2000. 
The Act specified that results of the alternate assessments must be reported with the same frequency and in the same 
detail as assessment results for students without disabilities. 

Schools across the nation have responded to the call for reform by collecting data to measure progress 
toward educational goals. Until recently, however, results for students with disabilities were rarely included in such 
data collection, as these students were typically excluded from statewide testing (Vanderwood et al., 1998). 
Ysseldyke and Olsen (1999) emphasized the importance of the inclusion of students with disabilities in statewide 
assessment when they stated: “It has been argued that when students with disabilities are out of sight in assessment 
and accountability systems they are out of mind when policy decisions are made and when educational structures 
and programs are designed” (p. 175). This point was brought out as well by the title of an article by Burgess and 
Kennedy (1998): What Gets Tested, Gets Taught, Who Gets Tested, Gets Taught. Excluding students from statewide 
testing may also result in denying them the anticipated benefits of accountability (e.g., higher expectations, 
increased student performance). Thurlow, Elliott, and Ysseldyke (as cited in Thompson, Quenemoen, Thurlow, & 
Ysseldyke, 2001) discussed the impact of including or excluding students with disabilities in statewide testing. Three 
important arguments for including all students in the assessment process emerged from their discussion: (1) all 
students must be included to obtain an accurate picture of the educational system (i.e., accurate comparisons across 
schools, districts, and states cannot be made if some districts assess all students and some assess only a portion of 
their students); (2) excluding students from assessment because they are not expected to do well may lower 
expectations for student achievement; and, perhaps most importantly, (3) policy decisions and resource allocation 
may be based on the results of the assessments, and if students with disabilities are not represented in the reported 
results, their performance will not influence the decision-making. 

It is critical that the educational needs of students with severe disabilities are not overshadowed by the 
needs of the much larger number of students participating in the general statewide testing. When designed and 
implemented to hold the educational system accountable for positive outcomes for students with disabilities, 
alternate assessment systems may become a powerful tool for system improvement. This paper presents a review of 
the progress made in the development of alternate assessments. Although states have followed different paths in 
creating alternate assessments, any approach must address the issues of assessment purpose, standards/content, 
eligibility, form, scoring, and reporting. Discussion of these issues forms the remainder of this paper. 

Assessment Purpose 

Alternate assessments may be used to measure student performance, system performance, or both. Kleinert, 
Haig, Kearns, and Kennedy (2000) noted that statewide assessment systems (including alternate assessments) may 
be thought of more as a matter of school accountability than of student accountability. Clearly, schools must be held 
accountable for providing the opportunities and appropriate resources for students to achieve the standards and goals 
upon which they are assessed. Measuring student performance is not independent from measuring system 
performance, as student performance informs the discussion of system performance. An alternate assessment system 
can be seen as a means of evaluating the degree to which the educational system is meeting the needs of students 
with severe disabilities (i.e., a basis for drawing conclusions about the educational system’s performance). 
Information about each student’s individual performance may be valuable in determining how well that student’s 
needs are being addressed, and aggregated information may be helpful for decisions made on a systemic level. 
Analysis of the assessment results should indicate programmatic strengths and weaknesses. If, for example, the 
assessment includes a measure of generalization, but analysis of the results indicates that students are 
unable to demonstrate generalization (apparently for lack of opportunity, because all instruction took place in the 
special education classroom), a system weakness is indicated. The power of such assessment systems to drive 
improvement in educational systems would be demonstrated should such a scenario result in the school revising its 
instructional practices and placing a greater emphasis on generalization of skills. How states will use the results of 
alternate assessments remains to be seen, but the opportunity exists to develop this process into a powerful tool for 
system improvement. 

Standards 

Standards refer both to what students should know (i.e., content standards) and how well they should know 
it (i.e., performance standards). The instructional goals and objectives for students with severe disabilities are 
usually quite different from those reflected in statewide testing. General education assessment systems typically 
address standards in the areas of language arts, mathematics, social studies, and science (Ford, Davern, & Schnorr, 
2001). Functional goals such as dressing oneself, safely crossing the street, or using public transportation do not fit 
easily under such standards. However, creating a separate set of standards for students with disabilities raises the 
issue of creating separate educational systems (Ysseldyke, Olsen, & Thurlow, 1997). IDEA 97 made it clear that 
students with disabilities must have access to the general education curriculum, and the Improving America’s 
Schools Act of 1994 (P.L. 103-382) required that standards apply to all students, including those with disabilities. 
These Acts would seem to discourage separate standards. 

In an effort to resolve the dilemma of applying high standards to all students, states have used a variety of 
strategies for describing the relationship between the functional curriculum accessed by students with severe 
disabilities and the general standards. Most states described their alternate assessments as being based on exactly the 
same standards as the assessments for students without disabilities (or a subset of those standards), and have 
expanded the standards to include functional skills as indicators (Thompson et al., 2001). Ford et al. (2001) 
discussed some of the problems involved in expanding the standards. They noted that general standards are typically 
expanded by either simplifying or redefining the existing standards. Simplifying may result in documentation of 
minor participation in classroom activities (e.g., citizenship standard simplified to drawing a flag), and redefining 
may result in contrived connections to the standard (e.g., historical perspective standard redefined to using a daily 
schedule). Either of these approaches may lead to educators spending an inordinate amount of time connecting the 
functional curriculum to standards that are not truly related. The time may be better spent developing inclusive 
standards that take into account the functional curriculum that is necessary for these students. 

Regardless of how the standards are described, the crux of the issue is that the standards that form the basis 
of the alternate assessments must reflect the curricular domains typically accessed by this student population (e.g., 
functional academic, communication, social, self-care, and vocational skills). Without this fundamental connection 
between the assessment and the curriculum, the assessment results will not support informed decision-making 
regarding how well special education programs are meeting the needs of students and what system improvements 
are indicated. States that have developed broad-based standards (e.g., students will use mathematical concepts to 
solve problems in daily life) find that they are more easily applied to students with disabilities than more narrowly 
defined standards (e.g., students will master algebraic functions). 

Eligibility 

Eligibility is the determination of which students should participate in the alternate assessment. It is 
expected that the majority of students with disabilities (98%) will participate in the general statewide testing, either 
under typical testing procedures or through the use of accommodations (Ysseldyke et al., 1997). The remaining 
students (i.e., the 1-2% with severe disabilities) will participate in alternate assessments, as use of the general 
education tests, even with appropriate accommodations, would not yield meaningful or useful information. The 
students participating in alternate assessment are usually described as those having the most severe cognitive deficits 
or multiple disabilities (Ysseldyke & Olsen, 1999), students enrolled in self-care, life skills, or functional programs, 
and students not pursuing general education outcomes (Warlick & Olsen, 1999). IDEA 97 mandated that members 
of the IEP team (i.e., those deemed to know the student best) make the eligibility determination. It is important that 
alternate assessments are viewed as a valid assessment for a specific group of students and are not used to keep 
students, who for a variety of reasons are not expected to do well on the general assessment, from “bringing down” 
the general assessment results (Langenfeld, Thurlow, & Scott, 1997). 

Some state guidelines regarding eligibility focus on disability classifications (e.g., severely mentally 
impaired, multiply impaired, autistic), while others take a curricular focus (e.g., functional or life skills programs) 
(Warlick & Olsen, 1999). Many states also list criteria not to be used for the eligibility decision, such as academic 
delays due to excessive absences, lack of instruction, social or cultural factors, disruptive behavior, or expectation of 
poor performance (Thompson et al., 2001). Many factors must be taken into account to determine eligibility for an 
alternate assessment. Using only the student’s disability category, for example, may lead to students who are capable of 
taking the general assessment (e.g., some students categorized as having autism) participating instead in the 
alternate assessment, and making the determination based on the student’s curriculum may overlook the possibility 
that the student might benefit from more access to the general curriculum. States often list several criteria, all of 
which must be met in order for a student to be eligible for the alternate assessment. For example, Nebraska’s criteria 
include documentation that the student’s demonstrated cognitive ability and adaptive behavior prevent completion of 
the general curriculum even with appropriate modifications, that the student’s curriculum is primarily functional, 
and that he/she requires individualized instruction to acquire, maintain, and generalize skills (Hill, Bird, & 
Dughman, 2000). 
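
The conjunctive nature of such criteria (every criterion must be met, and no single factor such as curricular placement is sufficient on its own) can be illustrated with a minimal sketch in Python. The record and field names below are hypothetical paraphrases of the three criteria described above; they are not drawn from Nebraska's actual documentation, and the sketch is not a substitute for the judgment of the IEP team.

```python
from dataclasses import dataclass


@dataclass
class EligibilityReview:
    """Hypothetical record of an IEP team's findings for one student."""
    # Cognitive ability and adaptive behavior prevent completion of the
    # general curriculum, even with appropriate modifications.
    ability_prevents_general_curriculum: bool
    # The student's curriculum is primarily functional.
    curriculum_primarily_functional: bool
    # Individualized instruction is required to acquire, maintain,
    # and generalize skills.
    needs_individualized_instruction: bool


def meets_all_criteria(review: EligibilityReview) -> bool:
    """Return True only when every listed criterion is documented as met."""
    return all([
        review.ability_prevents_general_curriculum,
        review.curriculum_primarily_functional,
        review.needs_individualized_instruction,
    ])


# Illustrative cases (invented): a functional curriculum alone is not enough.
curriculum_only = EligibilityReview(False, True, True)
all_met = EligibilityReview(True, True, True)
print(meets_all_criteria(curriculum_only), meets_all_criteria(all_met))
```

Under a rule of this kind, a student placed in a functional program but judged able to complete the general curriculum with modifications would remain in the general assessment.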

Forms of Assessment 

Assessment systems must be developed to measure progress towards the standards. General education has 
relied heavily on standardized achievement testing to measure student progress (Elliott, Braden, & White, 2001). As 
noted previously, this type of assessment does not yield useful information for students with severe disabilities. A 
number of different forms for alternate assessment have been selected by the states. The National Center on 
Educational Outcomes identified four general categories: portfolio assessment, checklist/rating scale of functional 
skills, IEP analysis, and other. The term portfolio is used to refer to any collection of materials and/or data for a 
specific student, and varies a great deal from state to state. Portfolios may consist of any combination of the 
following: work samples, audio and/or video clips, anecdotal records, surveys, adaptive behavior checklists, 
attendance reports, daily schedules, data charts, and communication systems (Thompson & Thurlow, 2001; Warlick 
& Olsen, 1999). In addition to measures of student performance, access to multiple settings, interaction with peers 
without disabilities, skill generalization, use of natural supports, and use of age-appropriate materials/activities may 
be measured when evaluating the portfolios (Warlick & Olsen, 1999). 

A few states have created alternate assessments consisting of locally developed checklists or rating scales 
of functional skills, which are completed by teachers and/or IEP teams (Thompson & Thurlow, 2001). Domains 
assessed through these checklists include functional academics, communication skills, domestic skills, and 
vocational skills. Checklists and rating scales are less time consuming to complete than portfolios, and could be 
standardized, but vary in the amount and quality of information they yield. Progress could be charted through the 
use of rating scales and checklists that are administered semi-annually or annually. 

Five states are analyzing student IEPs as the alternate assessment. IEP goals are categorized into domains 
and assessed according to rate of progress and/or level of support required to achieve the goal (Thompson & 
Thurlow, 2001). Thurlow (2000) points out that measuring only progress on IEP goals may lower expectations and 
lead to the conclusion that any amount of progress is acceptable. The intention of the IEP process has always been 
that the goals would be assessed throughout the school year, so this method is an extension of that procedure. However, 
it may be difficult to make any kind of comparison across students or programs given that the goals are so 
individualized. States categorized as “other” reported using data from eligibility assessment, out-of-level testing, and 
assessments conducted by the IEP team (Thompson & Thurlow, 2001). 

Scoring of Assessments 

Alternate assessments are scored by a variety of professionals under a variety of circumstances. States have 
developed systems in which teachers score the assessments of their own students, teachers score the assessments of 
students other than their own, state department of education staff score the assessments of all students, or a 
combination is employed, such as having both teachers and state department staff score the assessments and comparing the results 
(Warlick & Olsen, 1999). Teachers in Kentucky found they had difficulty maintaining objectivity when scoring their 
own students’ portfolios (Kleinert, Kearns, & Kennedy, 1997). Training for those scoring alternate assessments, 
scoring assessments for students of other teachers, and having more than one person score each assessment may help 
overcome the issue of objectivity. 

Reporting 

IDEA 97 requires that participation and performance results of alternate assessments be reported, but does 
not provide specific instruction for doing so. States are therefore left to determine the most appropriate methods of 
reporting results, and to make decisions about whether to aggregate the results of alternate assessments with the 
results of the general assessments. The issue of aggregation of results is further complicated by the performance 
descriptors used to summarize assessment results. General education accountability assessments are typically 
summarized by classifying results into descriptive categories such as mastery, near mastery, and partial mastery. 
Alternate assessment systems have followed this lead; however, they have not always used descriptors that match 
those used for the general assessment in their state. Using different descriptors would make aggregation with the 
scores of the general assessment difficult, while using the same descriptors allows all “mastery” scores (general or 
alternate) to be easily combined. Using the same descriptors and equally weighting them from either the general or 
alternate assessment does not, however, resolve the issues (discussed below) inherent in aggregating scores from 
very different assessments. Of the states that have determined the descriptors, about half have chosen descriptors 
that are the same as those used in the general assessment, and half have chosen different descriptors (Thompson & 
Thurlow, 2001). 

Bechard (2001) discussed the pros and cons of different models currently in use or proposed by various 
states to report assessment data. Aggregating the data, reporting the scores of all assessments (general and alternate) 
together, allows students who participate in alternate assessment to “count” as much as those who use the general 
assessment, and perhaps compels schools to place the same level of importance on the results of the alternate 
assessments as on the general assessments. However, the aggregation of results from what may be very different types 
of assessments raises questions of statistical soundness, and there is a danger that the scores of this very small 
part of the total group (less than two per cent) will be overlooked upon aggregation. If high stakes are attached to 
assessment results, a district may be more inclined to put resources into improving the scores of the larger number of 
students using the general assessment rather than into improving the scores of students with severe disabilities, 
which represent less than two per cent of the total scores. In response to these issues, some states have decided to 
report the scores in both aggregated and disaggregated forms. That is, they will report combined results that include 
both assessments and also report results of each of the assessments separately. This model places equal value on 
both types of assessment and communicates more information overall, but does not address the issue of the 
appropriateness of aggregating scores from different assessments. Another approach is to keep the general and 
alternate assessments completely separate. This approach avoids the statistical soundness issue by not aggregating 
scores, but reporting such a small number of alternate assessment scores separately may cause them to be 
overshadowed by the results of the general assessment. Thirty states currently have a system in place to report 
alternate assessments; ten of these report results aggregated with general assessment scores, and twenty report 
alternate assessment scores separately (Thompson & Thurlow, 2001). Many states are working on a process which 
will aggregate the scores of alternate assessments with general assessment scores, and also resolve the statistical 
issues such aggregation raises. 
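
The arithmetic behind these reporting models can be illustrated with a minimal sketch in Python. All counts below are invented for illustration and do not come from any state's data. The sketch also assumes that both assessments use the same three descriptors, which, as noted above, is true for only about half of the states that have chosen descriptors.

```python
from collections import Counter

# Hypothetical counts of students per performance descriptor (not real data).
general = Counter({"mastery": 500, "near mastery": 335, "partial mastery": 150})
alternate = Counter({"mastery": 3, "near mastery": 5, "partial mastery": 7})


def percentages(counts):
    """Return each descriptor's share of the group, as a percentage."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def build_report(general_counts, alternate_counts):
    """Produce disaggregated and aggregated views of the same results."""
    # Adding the counts is meaningful only if the descriptors match.
    combined = general_counts + alternate_counts
    alternate_share = (
        100.0 * sum(alternate_counts.values()) / sum(combined.values())
    )
    return {
        "general": percentages(general_counts),
        "alternate": percentages(alternate_counts),
        "aggregated": percentages(combined),
        "alternate_share": alternate_share,
    }


result = build_report(general, alternate)
for view in ("general", "alternate", "aggregated"):
    print(view, {k: round(v, 1) for k, v in result[view].items()})
print("alternate share of all scores (%):", round(result["alternate_share"], 1))
```

In this invented example the alternate assessment group makes up 1.5% of all scores, so its much lower mastery rate moves the aggregated mastery rate by less than half a percentage point. This is the overshadowing concern described above, and it is why some states report both aggregated and disaggregated results. The sketch does not address the separate question of whether scores from such different assessments are statistically comparable.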

The 1997 Reauthorization of IDEA required states to develop alternate assessments that would allow even 
students with the most severe disabilities to participate in statewide testing. After determining eligibility for 
participation in the alternate assessments, states initially had to decide from what standards they would develop the 
assessment. Most states report expanding the general standards to include functional skills. A number of different 
strategies were then used to develop the alternate assessments, including portfolios (work samples, anecdotal 
records, videotape, etc.), checklists/rating scales, and IEP analysis. One of the least defined areas of alternate 
assessment remains reporting of scores. IDEA 97 mandated reporting, but did not offer specific guidance. States that 
have determined a method of reporting are usually reporting alternate assessments separately from the general 
assessments, but are working on a process that would allow all scores to be aggregated. It will be important to 
monitor the process of alternate assessment development and implementation to ensure that the educational needs of 
students with severe disabilities are included in the overall evaluations of school effectiveness. 